Provably Minimally-Distorted Adversarial Examples

Authors

Nicholas Carlini

Guy Katz

Clark Barrett

David L. Dill

Abstract

The ability to deploy neural networks in real-world, safety-critical systems is severely limited by the presence of adversarial examples: slightly perturbed inputs that are misclassified by the network. In recent years, several techniques have been proposed for increasing robustness to adversarial examples, and yet most of these have been quickly shown to be vulnerable to future attacks. For example, over half of the defenses proposed by papers accepted at ICLR 2018 have already been broken. We propose to address this difficulty through formal verification techniques. We show how to construct provably minimally distorted adversarial examples: given an arbitrary neural network and input sample, we can construct adversarial examples which we prove are of minimal distortion. Using this approach, we demonstrate that one of the recent ICLR defense proposals, adversarial retraining, provably succeeds at increasing the distortion required to construct adversarial examples by a factor of 4.2.
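The idea of a provably minimal distortion can be illustrated with a small sketch. It is not the authors' implementation: it assumes a sound and complete oracle that answers "does an adversarial example exist within L-infinity radius eps?", and it replaces the neural network verifier with a toy linear classifier for which that question has an exact closed-form answer. The names `linear_oracle` and `minimal_distortion` are hypothetical. A binary search over the radius then brackets the minimal distortion to any chosen tolerance.

```python
from typing import Callable, Sequence, Tuple


def linear_oracle(w: Sequence[float], b: float,
                  x: Sequence[float]) -> Callable[[float], bool]:
    """Build an exact oracle for the toy classifier sign(w.x + b).

    Assumption: this stands in for a neural network verifier. For a linear
    score, an L-infinity perturbation of radius eps can shift the score by
    at most eps * ||w||_1, so the check below is both sound and complete.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    l1_norm = sum(abs(wi) for wi in w)

    def exists_adversarial(eps: float) -> bool:
        # True if some perturbation within radius eps reaches the boundary.
        return abs(score) - eps * l1_norm <= 0.0

    return exists_adversarial


def minimal_distortion(exists_adversarial: Callable[[float], bool],
                       upper: float, tol: float = 1e-6) -> Tuple[float, float]:
    """Bracket the minimal distortion by binary search over the radius.

    Returns (lo, hi): no adversarial example exists at radius lo, one does
    exist at radius hi, and hi - lo <= tol.
    """
    if not exists_adversarial(upper):
        raise ValueError("no adversarial example within the upper bound")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if exists_adversarial(mid):
            hi = mid   # an example exists: the minimum is at most mid
        else:
            lo = mid   # proven robust at mid: the minimum is above mid
    return lo, hi


oracle = linear_oracle(w=[1.0, -2.0], b=0.5, x=[1.0, 0.0])
lo, hi = minimal_distortion(oracle, upper=4.0)
```

In this toy case the score is 1.5 and the L1 norm of the weights is 3, so the bracket closes around 0.5. With a real network the oracle call is the expensive step, and the guarantee is only as strong as the verifier behind it.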

Similar resources

Provable defenses against adversarial examples via the convex outer adversarial polytope

We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations (on the training data; for previously unseen examples, the approach will be guaranteed to detect all adversarial examples, though it may flag some non-adversarial examples as well). The basic idea of the approach is to consider a convex outer approximation of the set ...

Adversarial Transformation Networks: Learning to Generate Adversarial Examples

Multiple different approaches of generating adversarial examples have been proposed to attack deep neural networks. These approaches involve either directly computing gradients with respect to the image pixels, or directly solving an optimization on the image pixels. In this work, we present a fundamentally new method for generating adversarial examples that is fast to execute and provides exce...

Crafting Adversarial Examples For Speech Paralinguistics Applications

Computational paralinguistic analysis is increasingly being used in a wide range of applications, including security-sensitive applications such as speaker verification, deceptive speech detection, and medical diagnostics. While state-of-the-art machine learning techniques, such as deep neural networks, can provide robust and accurate speech analysis, they are susceptible to adversarial attacks. ...

Certifying Some Distributional Robustness with Principled Adversarial Training

Neural networks are vulnerable to adversarial examples and researchers have proposed many heuristic attack and defense mechanisms. We address this problem through the principled lens of distributionally robust optimization, which guarantees performance under adversarial input perturbations. By considering a Lagrangian penalty formulation of perturbing the underlying data distribution in a Wasse...

Verifying Neural Networks with Mixed Integer Programming

Neural networks have demonstrated considerable success in a wide variety of real-world problems. However, the presence of adversarial examples (slightly perturbed inputs that are misclassified with high confidence) limits our ability to guarantee performance for these networks in safety-critical applications. We demonstrate that, for networks that are piecewise affine (for example, deep networks ...

Publication date: 2017
